# MoE Architecture

| Model | Developer | License | Tags | Downloads | Likes | Description |
|---|---|---|---|---|---|---|
| Qwen3 30B A3B Llamafile | Mozilla | Apache-2.0 | Large Language Model | 143 | 1 | Qwen3 is the latest generation of large language models in the Qwen series, offering both dense and mixture-of-experts (MoE) models. Extensive training gives it major gains in reasoning, instruction following, agent capabilities, and multilingual support. |
| Qwen3 14B Base Unsloth Bnb 4bit | unsloth | Apache-2.0 | Large Language Model, Transformers | 2,120 | 1 | Qwen3-14B-Base is a dense 14.8-billion-parameter model in the latest Qwen generation, supporting a 32k context length and 119 languages. |
| Qwen3 4B Base | unsloth | Apache-2.0 | Large Language Model, Transformers | 15.15k | 1 | Qwen3-4B-Base is a 4-billion-parameter model in the latest Qwen generation, supporting a 32k context length and multilingual processing. |
| Qwen3 14B Base | Qwen | Apache-2.0 | Large Language Model, Transformers | 9,718 | 21 | A 14.8-billion-parameter pre-trained base model in the latest Qwen generation, with support for 32k long-context understanding. |
| Deepseek R1 Zero | deepseek-ai | MIT | Large Language Model, Transformers | 4,034 | 905 | DeepSeek-R1 is DeepSeek's first-generation reasoning model, trained through reinforcement learning and excelling at mathematics, coding, and reasoning tasks. |
| Granite 3.1 1b A400m Base | ibm-granite | Apache-2.0 | Large Language Model, Transformers | 3,299 | 9 | Granite-3.1-1B-A400M-Base is a language model developed by IBM; a progressive training strategy extends its context length from 4K to 128K, and it supports multilingual and varied text-processing tasks. |
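Two entries above are MoE checkpoints, matching the section title: the "A3B" in Qwen3 30B A3B and the "A400M" in Granite 3.1 1B A400M denote the parameters activated per token, a small slice of the 30B and 1B totals. The sketch below shows the core mechanism, top-k expert routing, in PyTorch; the model width, expert count, and gating details are illustrative assumptions, not the actual configuration of either model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal top-k routed mixture-of-experts (MoE) feed-forward layer.

    Illustrative only: dimensions, expert count, and gating details are
    assumptions, not the configuration of Qwen3-30B-A3B or Granite-3.1.
    """

    def __init__(self, d_model: int = 64, d_ff: int = 256,
                 n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        # The router scores every token against every expert.
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model)
        scores = self.router(x)                     # (n_tokens, n_experts)
        top_w, top_i = scores.topk(self.k, dim=-1)  # keep k experts per token
        top_w = F.softmax(top_w, dim=-1)            # renormalize kept scores
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(self.k):
                hit = top_i[:, slot] == e           # tokens routed to expert e
                if hit.any():
                    out[hit] += top_w[hit, slot].unsqueeze(-1) * expert(x[hit])
        return out

moe = TopKMoE()
tokens = torch.randn(10, 64)  # 10 tokens; only 2 of 8 experts run per token
print(moe(tokens).shape)      # torch.Size([10, 64])
```

Because only k experts run per token, per-token compute scales with the active parameters rather than the total, which is how a 30B-parameter model can decode at roughly the cost of a 3B-parameter dense model.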
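Most entries above are tagged as Transformers checkpoints, so they load through the standard Hugging Face API. A minimal sketch follows, assuming a transformers version recent enough to include Qwen3 support and the accelerate package installed for `device_map="auto"`; the repo id is inferred from the Qwen entry in the table.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from the "Qwen3 14B Base" entry above.
model_id = "Qwen/Qwen3-14B-Base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" shards the model across available devices
# and requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

inputs = tokenizer("Mixture-of-experts models work by", return_tensors="pt")
inputs = inputs.to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```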